Data quality in the human and environmental health sciences: Using statistical confidence scoring to improve QSAR/QSPR modeling

Abstract

An increasing amount of toxicity data is becoming publicly available, allowing for in silico modeling. However, questions often arise as to how to incorporate data quality and how to deal with contradictory data when more than one data point is available for the same compound. In this study, two well-known and well-studied QSAR/QSPR models, for skin permeability and aquatic toxicology, were investigated in the context of statistical data quality. In particular, the potential benefits of incorporating the statistical Confidence Scoring (CS) approach within modeling and validation were examined. As a result, robust QSAR/QSPR models were created for the skin permeability coefficient and for the toxicity of nonpolar narcotics in the Aliivibrio fischeri assay. CS-weighted linear regression for training and CS-weighted root mean square error (RMSE) for validation were statistically superior to standard linear regression and standard RMSE. Strategies are proposed for interpreting data with high and low CS, as well as for dealing with large datasets containing multiple entries.
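The abstract does not give the formulas behind CS-weighted regression or CS-weighted RMSE, so the following is only a minimal sketch of one plausible reading. It assumes that each data point carries a non-negative confidence score that is used directly as its weight in a weighted least-squares fit, and that the weighted RMSE is the square root of the CS-weighted mean squared residual. The function names (`cs_weighted_fit`, `predict`, `cs_weighted_rmse`) and the toy numbers are hypothetical and are not taken from the study.

```python
import numpy as np


def cs_weighted_fit(X, y, cs):
    """Weighted least squares: minimise sum_i cs_i * (y_i - b0 - x_i . b)^2.

    Assumption: the confidence score cs_i is used directly as the weight.
    Returns the coefficient vector [b0, b1, ...] with the intercept first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cs = np.asarray(cs, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D input as a single descriptor
    if np.any(cs < 0) or cs.sum() <= 0:
        raise ValueError("confidence scores must be non-negative with a positive sum")
    A = np.column_stack([np.ones(len(y)), X])  # intercept column first
    sw = np.sqrt(cs)
    # Scaling each row by sqrt(weight) turns the weighted problem
    # into an ordinary least-squares problem.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the fitted linear model on new descriptor values."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    return coef[0] + X @ coef[1:]


def cs_weighted_rmse(y_true, y_pred, cs):
    """sqrt( sum_i cs_i * (y_i - yhat_i)^2 / sum_i cs_i )  (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    cs = np.asarray(cs, dtype=float)
    if np.any(cs < 0) or cs.sum() <= 0:
        raise ValueError("confidence scores must be non-negative with a positive sum")
    return float(np.sqrt(np.sum(cs * (y_true - y_pred) ** 2) / np.sum(cs)))


# Illustrative, made-up data: one descriptor, one low-confidence outlier.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([1.1, 2.9, 5.2, 7.0, 20.0])
cs_train = np.array([0.9, 0.8, 0.9, 0.7, 0.05])

coef = cs_weighted_fit(x_train, y_train, cs_train)
fitted = predict(coef, x_train)
print("coefficients:", coef)
print("CS-weighted RMSE:", cs_weighted_rmse(y_train, fitted, cs_train))
```

Under these assumptions, a point with a low confidence score pulls the fitted line less and contributes less to the validation error than a high-confidence point, which is the intuition behind weighting both training and validation by CS.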
